


lkmlşkmşlkmşlkmşlkmşkl

by yokdahaneler



Category: None - Fandom
Genre: None - Freeform, Other
Language: English
Status: Completed
Published: 2019-03-19
Updated: 2019-03-19
Packaged: 2019-11-24 20:31:48
Rating: Not Rated
Warnings: Creator Chose Not To Use Archive Warnings, No Archive Warnings Apply
Chapters: 4
Words: 3,955
Publisher: archiveofourown.org
Story URL: https://archiveofourown.org/works/18169583
Author URL: https://archiveofourown.org/users/yokdahaneler/pseuds/yokdahaneler
Summary: mlknlnklşknlşnklşk





	1. Chapter 1

1\. Introduction  
While the original aim of the MT pioneers was fully automatic MT systems, there has also been, at least since the 1966 ALPAC report and possibly even earlier, the view that computers could be used to help humans in their translation task rather than to replace them. In this chapter we will look at a range of computer-based tools that have been developed or proposed which can help translators. As we will see, ideas along these lines date back to the 1960s, even when access to computers was not particularly easy to obtain, nor were they especially efficient. In more recent times the ready availability of PCs, as well as the existence and growth of the Internet, could be said to have revolutionised the job of the translator.

 

The translation activities we will be discussing in this chapter can be broadly classified as Computer-Aided Translation (CAT), though often a finer distinction is made between Machine-Aided Human Translation (MAHT) and Human-Aided Machine Translation (HAMT), implying a distinction between a basically human activity involving computer-based tools on the one hand, and a computer-driven activity requiring the assistance of a human operator on the other. The distinction may be useful at times, though it involves a degree of fuzziness at the edges which need not concern us. Nevertheless, the terminology suggests a spectrum of modes of operation in which the computer plays a progressively bigger part, which can usefully dictate the order of presentation of topics in this chapter.

 

2\. Historical sketch  
The idea for a translator’s workstation (or “workbench”) is often attributed to Martin Kay, who in 1980 wrote a highly influential memo “The Proper Place of Men and Machines in Language Translation”. However, many of the ideas expressed by Kay had already been hinted at, or even implemented in admittedly crude systems. In 1966, the ALPAC report — (in)famous for its criticism of attempts at fully automatic MT — recommended among other things the development of computer-based aids for translators. Even before the ALPAC report, the German Federal Armed Forces Translation Agency (the Bundessprachenamt) used computers to produce text-oriented glossaries, i.e. lists of technical terms and their approved translations based on a given source text. 

Next came facilities for online access to multilingual term banks such as Eurodicautom in the CEC and Termium in Canada, and programs for terminology management by individual translators. In the late 1970s we also find the first proposal for what is now called translation memory, in which previous translations are stored in the computer and retrieved as a function of their similarity to the current text being translated.

As computational linguistic techniques were developed throughout the 1980s, Alan Melby was prominent in proposing the integration of various tools into a translator’s workstation at various levels: the first level would be basic word-processing, telecommunications and terminology management tools; the second level would include a degree of automatic dictionary look-up and access to translation memory; and the third would involve more sophisticated translation tools, up to and including fully automatic MT. Into the 1990s and the present day, commercial MT and CAT packages begin to appear on the market, incorporating many of these ideas. And as translators become more computer literate, we see them constructing their own “workstations” as they come to see translation-relevant uses for some of the facilities that are in any case part of the PC.

 

3\. Basic tools  
Let us start at the most basic level of computer use by translators. Although probably taken for granted by most translators, word processing software is an essential basic computational tool. Modern word processors include many useful facilities such as a word-count, a spell-checker, a thesaurus (in the popular sense of “synonym list”) and, of more dubious use to a translator, grammar and style checkers. Most of these functions are available with most well-known word-processing software packages, though we should note the extent to which all of them are highly language-dependent and language-specific.

 

This is no problem if we are working into one of the major commercially interesting languages (major European languages: English, French, Spanish, German, Italian, Portuguese, Russian; plus Japanese, Chinese, Korean and to a certain extent Arabic), but simple resources such as those just mentioned may not be available for other “minority” languages, or may be of inferior quality. In fact, for some languages such tools may not even be appropriate. For a language which uses a non-alphabetic writing system, like Japanese or Chinese, there isn’t really any concept of “spelling” to be corrected by a spell-checker (though of course there are other functions that a word-processor must provide, notably a means of inputting the characters in the first place).

 

On the subject of writing systems, thankfully much progress has been made in recent years to ensure that the scripts used for most of the world’s languages are actually available in word-processors. The Unicode consortium has made efforts to provide a standardised coding for multiple character sets, ensuring unique character codes which enable texts with mixtures of writing systems to be edited and printed and so on. Nevertheless, some problems remain, especially where languages use local variants of a more established writing system (diacritics seem to be a perennial problem), and certainly for many writing systems there is nothing like the range of fonts and type-faces that are available for the Roman alphabet.

 

Translations need to be revised, and the editing tools that word-processing packages provide are of course very useful. Although not yet commercially available, there has been talk amongst language engineering researchers and developers about the possibility, in the context of a translator’s workstation, of translator-oriented or linguistically sophisticated editing tools: a “translator-friendly” word-processor. Here is envisaged software with the normal word-processing facilities enhanced to facilitate the sort of text editing “moves” that a translator (or, perhaps, a translator working as a post-editor on some MT output) commonly makes.

 

Simple things like transposing two words at the touch of a function key are easy to imagine, but the software could incorporate more linguistically sophisticated tools such as “grammar-conscious global replace” in which the word-processing software was linguistically aware enough to recognize inflected variants of the word and change them accordingly, for example globally changing purchase to buy and getting “for free” purchasing → buying despite the missing e, and purchased → bought. With some “knowledge” of grammar, the word-processor could take care of grammatical consequences of revisions. For example, if you had a text in which the word fog had been translated as brouillard in French, but you decided brume was a better translation, you would have to do more than globally change brouillard to brume: brouillard is masculine, while brume is feminine, so some other changes (gender of adjectives and pronouns) may have to be made. You might want to replace look for with seek, and be hampered by the fact that the word for will not necessarily occur right next to the word look.
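To make the idea of a “grammar-conscious global replace” concrete, here is a minimal Python sketch. It assumes a small hand-written table pairing each inflected form of purchase with the matching form of buy; the table and the function name `grammar_conscious_replace` are invented for illustration, and a real tool would need a morphological analyser rather than a fixed list.

```python
import re

# Hypothetical, hand-written inflection table (an assumption for this sketch):
# each form of "purchase" is paired with the matching form of "buy".
INFLECTIONS = {
    "purchase": "buy",
    "purchases": "buys",
    "purchased": "bought",
    "purchasing": "buying",
}

def grammar_conscious_replace(text: str, table: dict) -> str:
    """Replace every inflected form listed in `table`, keeping an initial capital."""
    # Longest forms first, so "purchasing" is tried before "purchase".
    forms = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, forms)) + r")\b",
                         re.IGNORECASE)

    def swap(match):
        word = match.group(0)
        replacement = table[word.lower()]
        # Preserve capitalisation, e.g. at the start of a sentence.
        return replacement.capitalize() if word[0].isupper() else replacement

    return pattern.sub(swap, text)

sample = "Purchasing was easy: she purchased two, and he purchases one daily."
print(grammar_conscious_replace(sample, INFLECTIONS))
```

Even this toy version shows why a plain string replace is not enough: the irregular form bought has to come from lexical knowledge, not from pattern matching. The gender agreement problem (brouillard versus brume) and the discontinuous look ... for case are harder still, and are not attempted here.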

 

The translator-friendly word-processor could also search for “false friends” (e.g. librairie as a translation of library — see also below) and other “interference” errors, if the user is a competent but not fluent writer of the target language. It might also recognize mixed-language texts and operate the appropriate spell-checker on different portions of text (this paragraph for example contains some French words which my spell-checker is marking as possible misspellings!) Unfortunately, none of these features are as yet found in currently available word-processing software, and it should be clear to the reader that to incorporate them would involve knowledge of grammar and vocabulary, and the ability to analyse the text not unlike that needed to do MT.


	2. fdhfdhfdjfdj

**Summary for the Chapter:**

> fdjhdfjdfjdfjdfj

4\. Dictation tools

One technology that is up-and-coming and of interest to many translators is dictation tools. As an alternative to typing in their translations, translators are discovering that dictating their draft translation into the computer using speech recognition systems can be a great boost to productivity.

This gain is due not only to the obvious fact that most people can talk faster than they can type, but to other “hidden” advantages. Michael Benis has suggested that translators are less likely to come out with a clumsy or inelegant construction if they actually have to say it out loud.

Typographical errors are also reduced, since dictation software does not insert words that are not found in its dictionary — though of course they may not be the correct words, due to the limitations of the software. There is even a gain from the health point of view, since dictation systems allow the translator to get away from the confines of the keyboard, mouse and screen environment responsible for well-documented industrial illnesses.

 

Dictation systems are not without their drawbacks, however. The technology is still in its infancy, and can make annoying mistakes just because of the inherent difficulties of speech recognition. The user must speak more clearly and slowly than may be natural, and you can expect the system to be confused by homophones (words which sound identical) and even similar-sounding words.

 

Most systems work on the basis of “trigrams”, or sequences of three words, and include extensive statistics on the probability of word sequences. For example, if you dictated the sentences in (1) you could expect the system to get the correct homophone rode or rowed, because the disambiguating word bike or boat is within three words. But in a case like (2) it might not get it right.

(1) I rode the bike. I rowed the boat.

(2) This {boat, bike} is just like one that I sometimes {rode, rowed}.
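The effect of the three-word window in (1) and (2) can be sketched in a few lines of Python. The counts below are invented purely for illustration (a real system would estimate them from a very large corpus), and the helper names `trigrams`, `score` and `choose` are our own, not those of any actual dictation product.

```python
from collections import Counter

# Toy trigram counts, invented for illustration only.
TRIGRAM_COUNTS = Counter({
    ("rode", "the", "bike"): 9,
    ("rowed", "the", "boat"): 8,
    ("rode", "the", "boat"): 1,
    ("i", "sometimes", "rode"): 3,
    ("i", "sometimes", "rowed"): 3,
})

def trigrams(words):
    """All sequences of three consecutive words."""
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]

def score(words):
    """Sum the counts of every trigram in the word sequence."""
    return sum(TRIGRAM_COUNTS[t] for t in trigrams([w.lower() for w in words]))

def choose(template, candidates):
    """Fill the '_' slot with each candidate homophone and rank by score."""
    scored = []
    for candidate in candidates:
        words = [candidate if w == "_" else w for w in template.split()]
        scored.append((score(words), candidate))
    return sorted(scored, reverse=True)

print(choose("I _ the bike", ["rode", "rowed"]))
print(choose("This bike is just like one that I sometimes _", ["rode", "rowed"]))
```

In the first call the disambiguating word bike falls inside a trigram with the homophone, so rode wins clearly. In the second, bike is too far away to share a trigram with the slot, the two candidates receive the same score, and the model has no basis for choosing between them.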

Basic errors such as confusing there and their can nevertheless still be expected, while continuous speech has an unexpected effect on words which might be easy to identify when spoken in isolation. Try reading the following examples aloud to see what the text probably should be.  
(3) It’s hard to wreck a nice beach.

(4) What dime’s the neck strain to stop port?

All individuals have slightly different speech patterns, on top of the fact that regional accents can vary hugely. In fact, your own speech can vary significantly from occasion to occasion, for example if you have a cold, are tired, excited and so on. 

For this reason, dictation systems usually have to be trained to recognize the idiosyncrasies in your speech, so that when you first install dictation software, there will be a more or less lengthy training (or “enrolment”) period in which you are asked to read some “scripts” designed to sample a range of speech sounds from which the system can learn your individual phoneme system. In addition, you can “teach” the system additional vocabulary from your own field, by training it on texts of your choice.

Once the system has been trained, another way to improve performance is to use its correction utility. Many systems include this ability to learn from your corrections so as not to make the same mistake again. If using this feature, it is important to distinguish between correcting errors, and changing your mind about what you want to say. If you decided to change help to assist, you would not want the system to “learn” that the spoken word [hɛlp] is written a-s-s-i-s-t.

 

Of course, like most language-related software products, dictation systems are highly language-specific, and as with many such products, you will find a large choice of products for English and the other major European languages, but if you are working into any other languages it may be much harder to find a suitable product.

 

5\. Information technology

A further basic and important element in a translator’s workstation comes under the general heading of information technology. Many translators nowadays receive and send their work directly in computer-compatible form. Diskettes and writable CDs are excellent media for receiving, sending and storing large amounts of textual — and other — material. Equally, telecommunications are playing an increasing role, whereby translators receive and send material via phone-lines in the form of faxes and e-mail attachments.

 

What will happen to the translated text is often a concern of the translator, and so desktop publishing software might in some sense be part of the translator’s workstation. Formatting that needs to be preserved from the source-text can easily be copied over to the target text (in fact translators may simply copy the file and overwrite it with the translation).

Text which contains graphics which in turn contain text which must be translated is no longer the printer’s nightmare that it might once have been, if the translator has access to the same graphics package as was used to draw the original diagram, and in which text boxes can be simply substituted.

 

It should be said that the apparent advantages of using this technology can evaporate if for example the source text is badly formatted (with spaces instead of tabs in tables, linefeeds used to force page-breaks, and so on), and some translators may prefer to restrict their work to translating, and leave lay-out and formatting to the experts.

 

Another recent development in the world of computer-based text handling is the use of mark-up languages. The idea is that texts can contain “hidden” markers or tags to indicate structural aspects of the text, which in turn can be interpreted for formatting. Users of word-processors will be familiar with the concept of style templates, which are similar: by marking, say, a section header explicitly as such one can define separately the formatting associated with the tag, and this can be easily changed for the whole document if necessary.

Efforts have been made to standardize the way in which this mark-up is used, and the Standard Generalized Mark-up Language SGML is widely used. If you look at the “page source” of a web page, you will see HTML, which is very similar: tags are seen as symbols within angle brackets, and generally come in pairs with the “closing” symbol the same as the “opening” symbol but preceded by a slash. 

For example, a level-3 heading might be indicated `<h3>thus</h3>`. While SGML is widely used to define document structure and with it formatting conventions, it can also be used for a wide variety of other purposes, including annotating texts in many ways relevant to the translation process, for example, inserting codes identifying technical terms (and their translations), indicating grammatical information on ambiguous words, identifying the source of the translation of each sentence (human, MT system, translation memory, etc.), and any other commentary that the author or translator might wish to add, such as instructions to the printer, but which will not appear in the final document.


	3. fdhdfjfdjd

**Summary for the Chapter:**

> fdhdfhdfjfdj

6\. Lexical resources

Beyond word-processing and related tools, the translator’s workstation should facilitate access to an array of lexical resources, in particular online dictionaries and term banks. Online dictionaries may take the form of computer-accessible versions of traditional printed dictionaries, or may be specifically designed to work with other applications within the workstation. The online dictionaries may of course be mono-, bi- or multilingual.

 

The way in which information associated with each entry in the dictionary is presented may be under the control of the user: for example, the translator may or may not be interested in etymological information, pronunciation, examples of usage, related terms and so on.

 

Online dictionaries can be little more than an on-screen version of the printed text; or else they may take advantage of the flexible structure that a computer affords, with a hypertext format and flexible hierarchical structure, allowing the user to explore the resource at will via links to related entries.

 

The user may or may not be allowed to edit the contents of the dictionary, adding and deleting information including entire entries. Where the online dictionary is also used to provide a draft translation, it is normal to allow the user to add entries.

 

An important resource for translators is of course technical terminology. Online access to term banks was one of the earliest envisaged CAT tools, and with the growth of the Internet the focus nowadays is on licensed access to centrally maintained terminology rather than local copies, although there is obviously also a place in the translator’s workstation to allow translators to maintain their own lists of terminology in a variety of formats.

 

It would be narrow-minded to assume that the only sources of information that a translator needs are collections of words: easy access to a wide range of other types of information can be part of the translator’s workstation: a gazetteer can be useful to check proper names, as can a list of company names. Encyclopedias and other general reference works are all useful resources for the translator, and all can be integrated into the translator’s workstation.

 

Of course, as many translators already know, resources such as these, and many more, are readily available on the World Wide Web, access to which would be an essential element of the translator’s workstation.

7\. Features of typical commercial MT systems

Software with some translation capability will be an integral part of the translator’s workstation. The most important feature of this is that it is under the user’s control.

 

In this section we will look at the typical commercial MT system and consider to what extent it can be used by a translator. The first thing to notice is that, rightly or wrongly, commercial MT systems are designed primarily with use by non-linguists in mind.

 

This is evident in the packaging, and in the wording of user manuals, like the following, regarding updating dictionary entries:

> Many users, still haunted by memories of grammar classes at school, will be daunted by this idea. But with T1 there’s really no need to worry. Special user-friendly interfaces permit you to work in the lexicon with a minimum of knowledge and effort. (Langenscheidt’s T1 Professional, User Manual, p. 102; emphasis added)

So, the question quickly arises: Are these systems useful for real translators? Individuals should experiment with the less expensive systems (though bear in mind that cost and quality go hand in hand, and the very basic systems can easily give a bad impression of the slightly more expensive ones).

 

Let us consider what you are likely to get from an MT system, and how you might put it to use. The typical system presents itself as an extended word-processing system, with additional menus and toolbars for the translation-related functions.

Often, the suggested set-up has the source text shown in one window, with a second window for target text, in which the source text is initially displayed, to be over-written by the translation (Figure 7). Often, the user can customize the arrangement, for example to have source and target text side-by-side rather than vertically arranged, as shown. In its most simple mode of use, the user highlights a portion of text to be translated, as seen in Figure 7. The draft translation is then pasted in the appropriate place in the target text window, ready for post-editing.

 

Allowing the user to determine which portions of text should be sent to the MT system gives the user much more control over the process (although some systems will try to translate a whole sentence regardless of what text has been highlighted). If the user really can determine what text is to be translated, they will quickly learn to assess what types of text are likely to be translated well, and can develop a way of working with the system, translating more difficult sections immediately “by hand”, while allowing the system to translate the more straightforward parts.

 


	4. gfjfjgf

**Summary for the Chapter:**

> gfjgfkgfk

More sophisticated modes of operation are also possible. Most CAT systems allow the user to run a “new word” check on the source text, and then to update the system’s dictionary using the list of “unknown” words.
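A “new word” check of this kind amounts to comparing the words of the source text against the system’s dictionary. The following Python sketch illustrates the principle only: the function `unknown_words` and the tiny dictionary are assumptions made for the example, and a commercial system would normally reduce inflected forms to their base form before looking them up.

```python
import re

def unknown_words(source_text, dictionary):
    """List words of `source_text` missing from `dictionary`, in order of first appearance."""
    seen = set()
    unknown = []
    # Runs of letters (including accented ones), optionally joined by - or '.
    for word in re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", source_text):
        key = word.lower()
        if key not in dictionary and key not in seen:
            seen.add(key)
            unknown.append(key)
    return unknown

# A deliberately tiny dictionary, invented for the example.
dictionary = {"peler", "les", "pêches", "couper", "fruits", "en", "quartiers"}
text = "Peler les pêches. Dénoyauter. Couper les fruits en quartiers."
print(unknown_words(text, dictionary))
```

The resulting list is what the user would then work through, adding entries to the system’s dictionary before asking for a draft translation.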

 

Many systems offer a choice of interactive translation in which the system stops to ask the user to make choices. Many CAT users however have suggested that this slows down the process, since the system repeatedly offers the same choices, asks “stupid” questions, and apparently never “remembers” a relevant choice made earlier in the translation of the same text (though to do so correctly would actually require some quite sophisticated software design).

 

Full word-processing facilities are of course available in the target-text window to facilitate post-editing. With many systems, the same is true of the source-text window, which simplifies the task of pre-editing, i.e. altering the source-text so as to give the MT system a chance of doing a better draft translation.

 

The juxtaposition of the two windows and the ease of sentence-by-sentence translation suggest a novel method of trial-and-error computer-aided translation which has been called “post-editing the source text”. The idea behind this apparently counterintuitive activity is that the user can see what kind of errors the MT system makes, and can then change the source text in response to these errors.

 

Post-editing the source rather than the target text might involve the user in less work. For example, suppose we have a text containing a recipe in French: many of the sentences are instructions, expressed in the French infinitive form, as in (5).

(5) Peler les pêches. Dénoyauter. Couper les fruits en quartiers.  
to-peel the peaches. to-stone. to-cut the fruits into quarters

Now let us suppose that, unfortunately, our MT system always seems to translate infinitives in this type of sentence using the -ing form of the verb, instead of the imperative (6).

(6) Peeling the peaches. Stoning. Cutting the fruit into quarters.

Apart from the form of the verb, the translation is usable. But note, assuming that this error is repeated throughout a reasonably long text, the post-editing effort involved in correcting it. A simple search-and-replace deleting -ing will not work, because that would leave forms like \*Ston and \*Cutt.

 

An alternative is to edit the source-text, changing the infinitives ending in -er to imperatives ending in -ez. With a few exceptions this can be achieved by a (careful) global search-and-replace. Although it renders the source text less elegant (and one can give examples of similar fixes that actually make the source text ungrammatical), this does not matter, since the text in the source window can simply be a working copy of the original, and no one need see the cannibalised version.
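The contrast between the two search-and-replace strategies can be sketched in Python. This is an illustration under simplifying assumptions: the pattern treats any sentence-initial word ending in -er as an infinitive, and the `exceptions` parameter is a hypothetical stand-in for the care that the user would have to exercise by hand.

```python
import re

def strip_ing(target_text):
    """Naive target-side fix: delete every word-final -ing."""
    return re.sub(r"ing\b", "", target_text)

# A word ending in -er at the start of the text or straight after ., ! or ?
SENTENCE_INITIAL_ER = re.compile(r"(^|[.!?]\s+)([^\W\d_]+?)er\b")

def to_imperative(source_text, exceptions=frozenset()):
    """Source-side fix: turn sentence-initial -er forms into -ez forms."""
    def swap(match):
        lead, stem = match.group(1), match.group(2)
        if (stem + "er").lower() in exceptions:
            return match.group(0)  # leave listed words untouched
        return f"{lead}{stem}ez"
    return SENTENCE_INITIAL_ER.sub(swap, source_text)

print(strip_ing("Peeling the peaches. Stoning. Cutting the fruit into quarters."))
print(to_imperative("Peler les pêches. Dénoyauter. Couper les fruits en quartiers."))
```

The first call produces exactly the unusable forms noted above, while the second yields imperatives that the MT system has a better chance of translating correctly. A sentence beginning with a word such as Hier would be wrongly changed unless it is listed in `exceptions`, which is why the replace has to be a careful one.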

 

8\. Other corpus-based resources

A major interest of computational linguistics in recent years has been “corpus linguistics”. A corpus is a collection of text, usually stored in a computer-readable format. The example database of a translation memory is an example of a corpus, with the particularly interesting property of being an aligned parallel corpus, by which is meant that it represents texts which are translations of each other (“parallel”), and, crucially, the corpus has been subdivided into smaller fragments which correspond to each other (hence “aligned”).

 

This kind of corpus is an extremely useful resource for translators, and a number of tools can be built which make use of it. One of the most useful is the concordance, also sometimes known as a keyword in context (KWIC) list: it is a tool that literature scholars have used for many years.

 

This alternative name gives a clue as to what a concordance is, namely a list of occurrences of a given word, showing their context. Figure 9 shows an example of this, a list of all the occurrences of the word curious (or more accurately, the sequence of characters c-u-r-i-o-u-s) in Lewis Carroll’s famous book, Alice’s Adventures in Wonderland.
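A concordance of this kind is simple to produce. The Python sketch below is one possible implementation, not the program used for the figure; the function name `kwic` and the `width` parameter are our own choices. Like the listing described above, it searches for a sequence of characters rather than a word, so it also finds the keyword inside longer words.

```python
import re

def kwic(text, keyword, width=30):
    """One line per occurrence of `keyword`, with `width` characters of context either side."""
    flat = " ".join(text.split())  # collapse line breaks so each context is one line
    lines = []
    for match in re.finditer(re.escape(keyword), flat, re.IGNORECASE):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} {match.group(0)} {right}")
    return lines

sample = '"Curiouser and curiouser!" cried Alice.'
for line in kwic(sample, "curious"):
    print(line)
```

Because the left-hand context is padded to a fixed width, every occurrence of the keyword starts in the same column, which is what makes a KWIC list easy to scan by eye.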

 

A listing such as this is of interest in itself since it shows the range of use of an individual word. For a translator, of more interest is a bilingual concordance, in which each line is linked to the corresponding translation. This enables the translator to see how a particular word — or more usefully a phrase or a technical term — has been translated before.
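Given an aligned parallel corpus, a bilingual concordance needs little more than a search over the source side. The sketch below assumes the corpus is held as a list of (source, target) sentence pairs; the three pairs are invented for illustration, and `bilingual_concordance` is a hypothetical name.

```python
# Invented aligned sentence pairs (English source, French target).
ALIGNED = [
    ("The fog lifted at noon.", "Le brouillard s'est levé à midi."),
    ("A light fog covered the valley.", "Une légère brume couvrait la vallée."),
    ("The road was closed.", "La route était fermée."),
]

def bilingual_concordance(aligned_pairs, phrase):
    """Return every (source, target) pair whose source side contains `phrase`."""
    phrase = phrase.lower()
    return [(src, tgt) for src, tgt in aligned_pairs if phrase in src.lower()]

for src, tgt in bilingual_concordance(ALIGNED, "fog"):
    print(f"{src}  |  {tgt}")
```

Looking up fog in this toy corpus returns one pair with brouillard and one with brume, which is precisely the kind of evidence a translator wants when deciding between competing translations of a term.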


